JBoss Community Archive (Read Only)

ModeShape 5

PDF files

This sequencer is available starting with ModeShape 5.1.

The PDF sequencer processes application/pdf files and extracts the following metadata, described here by its node type definitions:

//------------------------------------------------------------------------------
// N A M E S P A C E S
//------------------------------------------------------------------------------
<jcr='http://www.jcp.org/jcr/1.0'>
<nt='http://www.jcp.org/jcr/nt/1.0'>
<mix='http://www.jcp.org/jcr/mix/1.0'>
<pdf='http://www.modeshape.org/pdf/1.0'>
<xmp='http://www.modeshape.org/xmp/1.0/'>

//------------------------------------------------------------------------------
// N O D E T Y P E S
//------------------------------------------------------------------------------

[pdf:metadata] > nt:unstructured, mix:mimeType
  - pdf:pageCount (long) mandatory
  - pdf:encrypted (boolean) mandatory
  - pdf:version (string) mandatory
  - pdf:orientation (string) mandatory
    < 'portrait', 'landscape', 'reverse landscape'
  - pdf:author (string)
  - pdf:creationDate (date)
  - pdf:creator (string)
  - pdf:keywords (string)
  - pdf:modificationDate (date)
  - pdf:producer (string)
  - pdf:subject (string)
  - pdf:title (string)
  + pdf:xmp (pdf:xmp)
  + pdf:page (pdf:page)

[pdf:page]
  - pdf:pageNumber (long) mandatory
  + pdf:attachment (pdf:attachment) = pdf:attachment

[pdf:attachment] > mix:mimeType
  - pdf:creationDate (date)
  - pdf:modificationDate (date)
  - pdf:subject (string)
  - pdf:name (string)
  - jcr:data (binary)

[pdf:xmp]
  - xmp:baseURL (string)
  - xmp:createDate (date)
  - xmp:creatorTool (string)
  - xmp:identifier (string) *
  - xmp:metadataDate (date)
  - xmp:modifyDate (date)
  - xmp:nickname (string)
  - xmp:rating (string)
  - xmp:label (string)
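
To make the mandatory properties and the pdf:orientation value constraint of [pdf:metadata] concrete, here is a minimal plain-Java sketch. It does not use the JCR API or any ModeShape class: PdfMetadataCheck and its check method are hypothetical names introduced only for this illustration, and the sample values in main are made up. A JCR repository would normally enforce these constraints itself through the node type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ModeShape: checks a map of property values
// against the mandatory properties and value constraint listed in [pdf:metadata].
final class PdfMetadataCheck {

    // Mandatory properties of pdf:metadata, copied from the node type definition.
    static final List<String> MANDATORY = Arrays.asList(
            "pdf:pageCount", "pdf:encrypted", "pdf:version", "pdf:orientation");

    // Allowed values of pdf:orientation, copied from its value constraint.
    static final List<String> ORIENTATIONS = Arrays.asList(
            "portrait", "landscape", "reverse landscape");

    // Returns a list of problems; an empty list means no problem was found.
    static List<String> check(Map<String, Object> properties) {
        List<String> problems = new ArrayList<>();
        for (String name : MANDATORY) {
            if (properties.get(name) == null) {
                problems.add("missing mandatory property " + name);
            }
        }
        Object orientation = properties.get("pdf:orientation");
        if (orientation != null && !ORIENTATIONS.contains(orientation)) {
            problems.add("unexpected pdf:orientation value: " + orientation);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        Map<String, Object> properties = new HashMap<>();
        properties.put("pdf:pageCount", 12L);
        properties.put("pdf:encrypted", Boolean.FALSE);
        properties.put("pdf:version", "1.4");
        properties.put("pdf:orientation", "portrait");
        properties.put("pdf:title", "Example"); // optional property
        System.out.println(check(properties)); // prints []
    }
}
```

Optional properties such as pdf:title or pdf:author may be absent without being reported, which mirrors the lack of a mandatory flag on them in the definition above.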

To use the sequencer in embedded mode, add it to the repository's JSON configuration like so:

{
    "name" : "PDF Sequencer Test Repository",
    "sequencing" : {
        "sequencers" : {
            "PDF sequencer" :  {
                "classname" : "pdf",
                "pathExpressions" : [ "default:/(*.pdf)/jcr:content[@jcr:data] => /sequenced/pdf" ]
            }
        }
    }
}

In JBoss AS, configure it like so:

<sequencer name="pdf-sequencer" classname="pdf" module="org.modeshape.sequencer.pdf">
   <path-expression>/files(//*.pdf[*])/jcr:content[@jcr:data] => /derived/pdf/$1</path-expression>
</sequencer>
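
Both configurations rely on a path expression of the form "input path => output path". As a reading aid, the sketch below splits such an expression into its parts. It is a plain-Java illustration, not ModeShape's own parser: PathExpressionParts is a hypothetical name, and treating the leading "default:" as a workspace name is an assumption based on the embedded example above. The sketch does not evaluate wildcards, predicates, or the $1 reference, which presumably points back to the parenthesized group in the input path.

```java
// Hypothetical helper, not ModeShape's own parser: splits a sequencer path
// expression of the form "[workspace:]inputPath => outputPath" into its parts.
final class PathExpressionParts {
    final String workspace;   // null when the expression names no workspace
    final String inputPath;
    final String outputPath;

    private PathExpressionParts(String workspace, String inputPath, String outputPath) {
        this.workspace = workspace;
        this.inputPath = inputPath;
        this.outputPath = outputPath;
    }

    static PathExpressionParts parse(String expression) {
        int arrow = expression.indexOf("=>");
        if (arrow < 0) {
            throw new IllegalArgumentException("missing '=>' in: " + expression);
        }
        String input = expression.substring(0, arrow).trim();
        String output = expression.substring(arrow + 2).trim();

        // Assumption: a workspace prefix is the text before a ':' that occurs
        // ahead of the first '/', e.g. "default" in "default:/(*.pdf)/...".
        // A ':' after the first '/' (as in "jcr:content") is part of the path.
        String workspace = null;
        int slash = input.indexOf('/');
        int colon = input.indexOf(':');
        if (colon >= 0 && slash >= 0 && colon < slash) {
            workspace = input.substring(0, colon);
            input = input.substring(colon + 1);
        }
        return new PathExpressionParts(workspace, input, output);
    }

    public static void main(String[] args) {
        PathExpressionParts embedded = parse(
                "default:/(*.pdf)/jcr:content[@jcr:data] => /sequenced/pdf");
        System.out.println(embedded.workspace);  // prints default
        System.out.println(embedded.inputPath);  // prints /(*.pdf)/jcr:content[@jcr:data]
        System.out.println(embedded.outputPath); // prints /sequenced/pdf

        PathExpressionParts jbossAs = parse(
                "/files(//*.pdf[*])/jcr:content[@jcr:data] => /derived/pdf/$1");
        System.out.println(jbossAs.workspace);   // prints null
        System.out.println(jbossAs.outputPath);  // prints /derived/pdf/$1
    }
}
```

Read this way, the embedded example takes its input from nodes whose name ends in .pdf and that have a jcr:content child with a jcr:data property, and writes the sequenced output under /sequenced/pdf.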

JBoss.org Content Archive (Read Only), exported from JBoss Community Documentation Editor at 2020-03-11 12:12:59 UTC, last content change 2016-05-20 09:52:17 UTC.